Efficient Mining Differential Co-Expression Constant Row Bicluster in Real-Valued Gene Expression Datasets

Authors

  • Miao Wang
  • Xuequn Shang
  • Zhanhuai Li
  • Wenbin Liu
Abstract

Biclustering aims to mine sets of genes that are co-expressed under a subset of experimental conditions in a gene expression dataset. Recently, differential co-expression biclustering has been used to identify class-specific biclusters between two gene expression datasets. However, existing approaches cannot efficiently handle differential co-expression constant row biclusters in real-valued datasets. In this paper, we propose an algorithm, DRCluster, to identify Differential co-expression constant Row biClusters in two real-valued gene expression datasets. First, DRCluster infers the differentially co-expressed genes from each pair of samples in the two datasets and constructs a differential weighted undirected sample-sample relational graph. Second, the differential co-expression constant row biclusters are mined from this graph. We also design several pruning techniques for mining maximal differential co-expression constant row biclusters without candidate maintenance. The experimental results show that our algorithm is more efficient than the existing one. We evaluate DRCluster using the MSE score and Gene Ontology; the results show that it finds more significant and biologically meaningful differential biclusters than the traditional algorithm.
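The first step described in the abstract (building a sample-sample graph from genes that are differentially co-expressed in each pair of samples) can be illustrated with a minimal sketch. The abstract does not define the co-expression test, so everything below is an assumption made for illustration: both datasets are taken to be genes x samples matrices of the same shape, a gene counts as co-expressed in a sample pair when its two values differ by at most a hypothetical tolerance `eps`, and a gene is "differential" for that pair when it is co-expressed in the first dataset but not in the second. The function and parameter names are hypothetical and are not taken from DRCluster.

```python
from itertools import combinations

import numpy as np


def coexpressed_genes(data, i, j, eps):
    """Return indices of genes whose values in samples i and j differ by at most eps.

    Assumption: this tolerance test stands in for the paper's (unstated)
    constant-row co-expression criterion.
    """
    close = np.abs(data[:, i] - data[:, j]) <= eps
    return {int(g) for g in np.flatnonzero(close)}


def differential_sample_graph(data_a, data_b, eps=0.5, min_genes=1):
    """Build an undirected sample-sample graph as {(i, j): gene_set}.

    An edge (i, j) is kept when at least min_genes genes are co-expressed
    in data_a but not in data_b for that sample pair. The size of the gene
    set can serve as the edge weight.
    """
    if data_a.shape != data_b.shape:
        raise ValueError("this sketch assumes both datasets share one shape")
    graph = {}
    n_samples = data_a.shape[1]
    for i, j in combinations(range(n_samples), 2):
        in_a = coexpressed_genes(data_a, i, j, eps)
        in_b = coexpressed_genes(data_b, i, j, eps)
        differential = in_a - in_b  # class-specific genes for this pair
        if len(differential) >= min_genes:
            graph[(i, j)] = differential
    return graph
```

In this sketch, a candidate bicluster would correspond to a set of samples whose pairwise edges share a common gene set; the mining and pruning steps mentioned in the abstract are not reproduced here.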


Similar resources

BIDENS: Iterative Density Based Biclustering Algorithm With Application to Gene Expression Analysis

Biclustering is a very useful data mining technique for identifying patterns in which different genes are correlated over a subset of conditions in gene expression analysis. Association rule mining is an efficient approach to biclustering, as in the BIMODULE algorithm, but it is sensitive to the values given to its input parameters and to the discretization procedure used in the preprocessing s...


Efficient Mining Maximal Subspace Differential Co-expression Patterns in Matrix Datasets: a General Earthquake Analysis Approach

Electromagnetic anomaly observations before earthquakes have been confirmed in many cases of strong earthquakes. The analysis of earthquake magnetic anomalies is an effective approach for seismo-precursor detection. Traditional frequent mining methods for analyzing electromagnetic matrix datasets often find the co-related items. However, these methods may miss the items which are differential ...


Randomization methods for assessing data analysis results on real-valued matrices

Randomization is an important technique for assessing the significance of data analysis results. Given an input dataset, a randomization method samples at random from some class of datasets that share certain characteristics with the original data. The measure of interest on the original data is then compared to the measure on the samples to assess its significance. For certain types of data, e...


Application of Cardinality based GRASP to the Biclustering of Gene Expression Data

Biclustering algorithms perform simultaneous row and column clustering of a given data matrix. In a gene expression dataset, a bicluster is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering is a useful data mining technique for identifying local patterns in gene expression data. In this paper, biclusters are identified in two steps. In the fir...


Randomization of real-valued matrices for assessing the significance of data mining results

Randomization is an important technique for assessing the significance of data mining results. Given an input data set, a randomization method samples at random from some class of datasets that share certain characteristics with the original data. The measure of interest on the original data is then compared to the measure on the samples to assess its significance. For certain types of data, e....




Publication date: 2012